AITopics | human activity

22c16986b2f50af520f56dc34d91e403-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Apr-25-2026, 02:06:14 GMT

artificial intelligence, machine learning, natural language, (13 more...)

Neural Information Processing Systems

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions

Neural Information Processing Systems | Mar-22-2026, 16:04:28 GMT

Vision-and-Language Navigation (VLN) aims to develop embodied agents that navigate based on human instructions. However, current VLN frameworks often rely on static environments and optimal expert supervision, limiting their real-world applicability. To address this, we introduce Human-Aware Vision-and-Language Navigation (HA-VLN), extending traditional VLN by incorporating dynamic human activities and relaxing key assumptions. We propose the Human-Aware 3D (HA3D) simulator, which combines dynamic human activities with the Matterport3D dataset, and the Human-Aware Room-to-Room (HA-R2R) dataset, extending R2R with human activity descriptions. To tackle HA-VLN challenges, we present the Expert-Supervised Cross-Modal (VLN-CM) and Non-Expert-Supervised Decision Transformer (VLN-DT) agents, utilizing cross-modal fusion and diverse training strategies for effective navigation in dynamic human environments. A comprehensive evaluation, including metrics considering human activities, and systematic analysis of HA-VLN's unique challenges, underscores the need for further research to enhance HA-VLN agents' real-world robustness and adaptability. Ultimately, this work provides benchmarks and insights for future research on embodied AI and Sim2Real transfer, paving the way for more realistic and applicable VLN systems in human-populated environments.

artificial intelligence, proceedings, vision-and-language navigation, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.39)

Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions (Heng Li)

Neural Information Processing Systems | Feb-18-2026, 08:05:06 GMT

Vision-and-Language Navigation (VLN) aims to develop embodied agents that navigate based on human instructions. However, current VLN frameworks often rely on static environments and optimal expert supervision, limiting their real-world applicability. To address this, we introduce Human-Aware Vision-and-Language Navigation (HA-VLN), extending traditional VLN by incorporating dynamic human activities and relaxing key assumptions.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.92)
Asia > India (0.04)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

c60468eca9cd0b0083f0ff9d0aeb171a-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Feb-17-2026, 00:37:55 GMT

artificial intelligence, machine learning, radar, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Communications (0.69)
Information Technology > Artificial Intelligence > Robots (0.68)

Supplementary for Symbol-LLM: Leverage Language Models for Symbolic System in Visual Human Activity Reasoning

Neural Information Processing Systems | Feb-12-2026, 11:37:18 GMT

Xiaoqian Wu, Shanghai Jiao Tong University, enlighten@sjtu.edu.cn. In Tab. 1, we summarize the notations used in this work for clarity. Notation and definitions: r, a rule; the size of the premise symbols set M; S is the symbol set, and R is the rule set; A \ B, the set difference of A and B; D, a very large-scale activity images database.

large language model, natural language, symbolic system, (19 more...)

Neural Information Processing Systems

Country: Asia > China > Shanghai > Shanghai (0.25)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

22c16986b2f50af520f56dc34d91e403-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems | Feb-7-2026, 21:42:35 GMT

computer vision, proceedings, recognition, (10 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.43)

Pigs have been island hopping for 50,000 years

Popular Science | Jan-5-2026, 15:53:00 GMT

With human help, the mammals can defy 'the world's most fundamental natural boundaries.' Despite not exactly being world-renowned swimmers, pigs have spread across the Asia-Pacific region for thousands of years. With the genetic and archeological data from over 700 pigs, a team of scientists documented how people helped the mammals make their way across thousands of miles. "This research reveals what happens when people transport animals enormous distances, across one of the world's most fundamental natural boundaries," evolutionary geneticist and study co-author Dr. David Stanton of the University of Cardiff and Queen Mary University of London said in a statement. "These movements led to pigs with a melting pot of ancestries. These patterns were technically very difficult to disentangle, but have ultimately helped us understand how and why animals came to be distributed across the Pacific islands."

andrew paul, island, wallace line, (14 more...)

Popular Science

Country:

Asia > Southeast Asia (0.06)
Oceania > Vanuatu (0.05)
South America > Brazil (0.05)
(14 more...)

Genre: Research Report > New Finding (0.90)

Technology: Information Technology > Artificial Intelligence (0.36)

ActionSense: A Multimodal Dataset and Recording Framework for Human Activities Using Wearable Sensors in a Kitchen Environment

Neural Information Processing Systems | Dec-24-2025, 06:27:15 GMT

This paper introduces ActionSense, a multimodal dataset and recording framework with an emphasis on wearable sensing in a kitchen environment. It provides rich, synchronized data streams along with ground truth data to facilitate learning pipelines that could extract insights about how humans interact with the physical world during activities of daily living, and help lead to more capable and collaborative robot assistants. The wearable sensing suite captures motion, force, and attention information; it includes eye tracking with a first-person camera, forearm muscle activity sensors, a body-tracking system using 17 inertial sensors, finger-tracking gloves, and custom tactile sensors on the hands that use a matrix of conductive threads. This is coupled with activity labels and with externally-captured data from multiple RGB cameras, a depth camera, and microphones. The specific tasks recorded in ActionSense are designed to highlight lower-level physical skills and higher-level scene reasoning or action planning. They include simple object manipulations (e.g., stacking plates), dexterous actions (e.g., peeling or cutting vegetables), and complex action sequences (e.g., setting a table or loading a dishwasher).

actionsense, multimodal dataset and recording framework, wearable sensor, (7 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.07)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)

MOMA-LRG: Language-Refined Graphs for Multi-Object Multi-Actor Activity Parsing

Neural Information Processing Systems | Dec-23-2025, 21:53:03 GMT

Video-language models (VLMs), large models pre-trained on numerous but noisy video-text pairs from the internet, have revolutionized activity recognition through their remarkable generalization and open-vocabulary capabilities. While complex human activities are often hierarchical and compositional, most existing tasks for evaluating VLMs focus only on high-level video understanding, making it difficult to accurately assess and interpret the ability of VLMs to understand complex and fine-grained human activities. Inspired by the recently proposed MOMA framework, we define activity graphs as a single universal representation of human activities that encompasses video understanding at the activity, sub-activity, and atomic action level. We redefine activity parsing as the overarching task of activity graph generation, requiring understanding human activities across all three levels. To facilitate the evaluation of models on activity parsing, we introduce MOMA-LRG (Multi-Object Multi-Actor Language-Refined Graphs), a large dataset of complex human activities with activity graph annotations that can be readily transformed into natural language sentences. Lastly, we present a model-agnostic and lightweight approach to adapting and evaluating VLMs by incorporating structured knowledge from activity graphs into VLMs, addressing the individual limitations of language and graphical models. We demonstrate strong performance on few-shot activity parsing, and our framework is intended to foster future research in the joint modeling of videos, graphs, and language.

human activity, language-refined graph, multi-object multi-actor activity parsing, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions (Heng Li)

Neural Information Processing Systems | Oct-10-2025, 18:16:35 GMT

Vision-and-Language Navigation (VLN) aims to develop embodied agents that navigate based on human instructions. However, current VLN frameworks often rely on static environments and optimal expert supervision, limiting their real-world applicability. To address this, we introduce Human-Aware Vision-and-Language Navigation (HA-VLN), extending traditional VLN by incorporating dynamic human activities and relaxing key assumptions.

agent, instruction, navigation, (13 more...)

Neural Information Processing Systems

Country:

North America > United States (0.92)
Asia > India (0.04)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis